[Bugfix] add input embedding #11684
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Force-pushed from 0429875 to 8c7751d.
Thanks for opening this PR, can you explain what this PR is about and how this is related to #11375?
@DarkLight1337 Sorry for referencing the incorrect issue number. Please refer to the following issues for context: #416 and #8323.
Force-pushed from 5963444 to f70bbb3.
vllm/model_executor/models/qwen2.py (outdated)
@@ -450,6 +450,7 @@ def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
         else:
             self.lm_head = ParallelLMHead(config.vocab_size,
                                           config.hidden_size,
+                                          True,
I think this will break models that don't have bias weights. Can you read this from the HF config?
Thanks, I'll fix it later.
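To make the suggested fix concrete, a sketch only: read the bias setting from the HF config instead of hardcoding `True`. The attribute name `lm_head_bias` is hypothetical; the actual field depends on the model family, and defaulting to `False` preserves current behavior for checkpoints without bias weights.

```python
# Sketch, not the actual fix: condition lm_head bias on the HF config.
# "lm_head_bias" is a hypothetical config attribute; default to False so
# models without bias weights are unaffected.
lm_head_bias = getattr(config, "lm_head_bias", False)
self.lm_head = ParallelLMHead(config.vocab_size,
                              config.hidden_size,
                              bias=lm_head_bias)
```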
Force-pushed from f70bbb3 to cdabfaf.
This PR adds support for passing prompt_embeds to LLM.generate, either as a single prompt dict or as a list of prompt dicts (see the sketch below). This enables use cases where only the embedding layer is fine-tuned, so the same model backend can serve multiple custom-tuned embedding layers.
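A minimal usage sketch of the API the description suggests; the `"prompt_embeds"` key, the model name, and the expected tensor shape `(seq_len, hidden_size)` are assumptions based on this PR's description rather than a confirmed final interface:

```python
import torch
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-1.5B-Instruct")  # model chosen for illustration
params = SamplingParams(temperature=0.0, max_tokens=32)

# Hypothetical: embeddings produced offline by a fine-tuned embedding layer,
# one tensor per prompt with shape (seq_len, hidden_size).
input_embeds = torch.load("prompt_embeds.pt")

# Single prompt passed as a dict with a "prompt_embeds" key ...
outputs = llm.generate({"prompt_embeds": input_embeds}, params)

# ... or a batch of prompts as a list of such dicts.
outputs = llm.generate([{"prompt_embeds": input_embeds},
                        {"prompt_embeds": input_embeds}], params)

for out in outputs:
    print(out.outputs[0].text)
```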
FIX #416
FIX #8323